Feature/video support in random mm dataset #25963
base: main
Conversation
Signed-off-by: Eugene Khvedchenia <[email protected]>
…hen generating random inputs (This is to avoid inserting mm-related tokens which may confuse VLM models) Signed-off-by: Eugene Khvedchenia <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
I see you generate a temporary mp4 file, dump the video into it, read it into bytes, and then send it base64-encoded. I suspect that passing a reference to the temporary file in the payload, rather than base64-encoding it, would speed up inference. By building on top of your code we could easily test this hypothesis with hard facts 👍
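For context, the base64 flow being discussed can be sketched roughly like this. This is a minimal illustration, not the PR's actual helper, and the `frames_writer` callback is a hypothetical stand-in for whatever writes the synthetic mp4:

```python
import base64
import tempfile


def video_to_data_url(frames_writer) -> str:
    """Dump a synthetic video to a temporary mp4, then return a base64
    data URL suitable for an OpenAI-chat multimodal payload.
    (Sketch only; the PR's actual helper may differ.)
    """
    with tempfile.NamedTemporaryFile(suffix=".mp4") as tmp:
        # Hypothetical callback that writes mp4 bytes to the given path.
        frames_writer(tmp.name)
        tmp.seek(0)
        video_bytes = tmp.read()
    b64 = base64.b64encode(video_bytes).decode("ascii")
    return f"data:video/mp4;base64,{b64}"
```

The follow-up idea above would replace the data URL with a file path or URL reference in the payload, avoiding the ~33% size overhead of base64.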
What should my action points be in that regard?
I don't think you need to change this PR to enable the comparison I mentioned. It's only a potential follow-up.
Signed-off-by: Eugene Khvedchenya <[email protected]>
I'm going to turn on the ready label so that you can see whether the benchmark tests pass.
…de specific tokens Signed-off-by: Eugene Khvedchenia <[email protected]>
Allow benchmarking models using the random-mm dataset with video inputs.

Purpose
Can now do this:
vllm bench serve \
  --backend openai-chat --endpoint /v1/chat/completions \
  --dataset-name random-mm --num-prompts 256 \
  --model nvidia/Cosmos-Reason1-7B \
  --max-concurrency 32 \
  --random-prefix-len 0 \
  --random-input-len 30 \
  --random-output-len 128 \
  --random-mm-base-items-per-request 1 \
  --random-mm-num-mm-items-range-ratio 0 \
  --random-mm-bucket-config '{(512, 512, 16): 1.0}' \
  --request-rate inf \
  --ignore-eos \
  --seed 42
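As an aside, the `--random-mm-bucket-config` value is a Python-literal dict. A hedged sketch of how such a string could be parsed follows; the benchmark's real parser may differ, and the reading of the tuple as (height, width, num_frames) mapped to a sampling probability is an assumption:

```python
import ast


def parse_bucket_config(s: str) -> dict:
    """Parse a bucket-config string like '{(512, 512, 16): 1.0}' into a
    dict mapping (height, width, num_frames) -> sampling probability.
    (Sketch only; the field interpretation is an assumption.)
    """
    cfg = ast.literal_eval(s)
    if not isinstance(cfg, dict):
        raise ValueError("bucket config must be a dict literal")
    total = sum(cfg.values())
    # Probabilities across buckets should form a distribution.
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"bucket probabilities must sum to 1, got {total}")
    return cfg
```

With the single bucket above, every request gets one 512x512, 16-frame video item.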